Skip to content

Use ImportResolver for tag imports#8254

Merged
stevenfontanella merged 8 commits intomainfrom
tag-import
Feb 10, 2026
Merged

Use ImportResolver for tag imports#8254
stevenfontanella merged 8 commits intomainfrom
tag-import

Conversation

@stevenfontanella
Copy link
Member

@stevenfontanella stevenfontanella commented Jan 29, 2026

Part of #8180 and #8261. Fixes the semantics/spec test when the same tag is imported in different instances, in which case the tag should behave as a new identity, which was previously not the case (see the tags in the modified instance.wast in this PR).

@stevenfontanella stevenfontanella changed the title Refactor tag / tag import logic Use ImportResolver for tag imports Feb 3, 2026
@stevenfontanella stevenfontanella marked this pull request as ready for review February 3, 2026 04:43
// internal name.
std::vector<Literals> definedGlobals;
std::vector<std::unique_ptr<RuntimeTable>> definedTables;
std::vector<Tag> definedTags;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need std::unique_ptr for definedTables but not for definedGLobals or definedTags? Or could we get rid of the unique ptr in a separate change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definedTables are unique_ptrs because RuntimeTable is a virtual type. It also makes a little more sense for tables than for globals and tags since tables are more like 'control' while globals and tags are more like plain data (setting aside what we discussed about tags having an identity).

We could potentially remove the indirection by changing it to RealRuntimeTable but it isn't possible right now because we inject a fake RuntimeTable in ctor eval: link. Once we resolve the TODO mentioned there, we could change it. On that note: what's the reasoning for wanting to remove the unique_ptr here? Is it speed or something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thanks. My motivation for asking is that I think it would be nice to be as uniform as possible with how we treat the different kinds of module fields and their corresponding runtime objects here. So I would have expected the vector of defined tags to look very much like the vector of defined tables, i.e. both or neither using unique_ptrs. (This is also why I would expect the tags to be represented by RuntimeTag objects containing pointers to the instantiated Tag definitions, just like RuntimeTable has a pointer to a Table.)

I would go as far as to say that RuntimeTag doesn't need to support an equality operator or do anything except wrap a Tag*. We can tell if two RuntimeTags are the same if they physically have the same address.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is you point that because these are all module fields / importable things that they should all either be unique_ptrs or not unique_ptrs? I feel that they have that in common, but what they don't have in common is whether they encode data or behavior. Literals are closer to data and tables/memories/functions are more of behavior. e.g. for tables we disable copy construction and move assignment, but it's fine to do this with a Literal (and we do do that here). Also it's a little strange (although allowed) to do copy assignment on the pointed-at value of a unique_ptr, which the former code link would do if we changed definedGlobals to hold unique_ptrs.

The rule I'm following is that value objects can be held directly, while behavior objects disable copying/assignment and are held with unique_ptrs. From that perspective it's fine if globals aren't held in a unique_ptr.

I didn't mention tags because they're in an in-between state in this PR. Here I'm using a 'data' object to represent them, but it would mostly be wrong to copy them since it would change its identity. In the next PR I plan to add a type for them that can't be copied, or maybe I'd represent them with unique_ptrs instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mental model, these are all just representations of things held in the WebAssembly store, so they all just hold data. As an additional object oriented C++-ism, we may also give them methods instead of operating on them with free functions, but I don't see the data/behavior distinction you're describing.

This mental model also explains why I think it's natural to determine whether two runtime tags are the same by comparing their addresses (i.e. unique_ptr values). The formal semantics also just compares the addresses of tags when determining whether a catch handler applies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(although I noticed that the spec has a bug here 🤦 WebAssembly/spec#2076)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backing up it sounds like you'd want one of these two options?

  std::vector<std::unique_ptr<Literals>> definedGlobals;
  std::vector<std::unique_ptr<RuntimeTable>> definedTables;
  // ...etc

or

  std::vector<Literals> definedGlobals;
  std::vector<RealRuntimeTable> definedTables;
  // ...etc

I think option 1 is bad because it forces an extra indirection for globals that we don't need, and 2 is bad because:

  1. It requires us to use the concrete class instead of the interface, which is relevant because we have cases that want to overload the handling of tables (ctor-eval does this, and the fuzzer does not but does use ModuleRunner directly so it wouldn't be able to overload a table's behavior without also subclassing ModuleRunnerBase if we use RealRuntimeTable here).
  2. It requires RealRuntimeTable to be movable, and I think it shouldn't be. It can be made movable but ideally tables should be created once and then only referenced, which also matches Wasm's semantics better. We can avoid the need for it to be movable by using std::deque instead, but I think unique_ptr is still more convenient for non-movable types, e.g. a deque can't delete from the middle with non-moveable types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think maintaining the ability to have users of the interpreter provide their own implementations of tables (and maybe memories, etc.) is important, so I would go with using the unique_ptrs. I don't think the extra indirection will make much of a difference in practice, since the import resolver would be giving out raw pointers or references to Literals or RuntimeTable either way.

... but the decision reduces to determining whether adding an unnecessary unique_ptr around the Literals is worth the uniformity of having the unique_ptr for everything. I guess I don't feel too strongly about it, so the current approach sgtm too.

@stevenfontanella stevenfontanella merged commit 4ba48e5 into main Feb 10, 2026
17 checks passed
@stevenfontanella stevenfontanella deleted the tag-import branch February 10, 2026 01:11
@stevenfontanella stevenfontanella linked an issue Feb 10, 2026 that may be closed by this pull request
78 tasks
@kripken
Copy link
Member

kripken commented Feb 10, 2026

The fuzzer is reporting Fatal: getTagOrNull not implemented in ctor-eval.

Do you need a testcase @stevenfontanella , or is that error message already clear enough?

@stevenfontanella
Copy link
Member Author

Weird, I thought this was already a fatal error in ctor-eval:
image. Probably my code made it evaluate this at the wrong time.

Will send a fix soon.

stevenfontanella added a commit that referenced this pull request Feb 10, 2026
Fix for [fuzzer-detected crash when ctor-eval runs on a module that
imports a
tag](#8254 (comment)).
Prior to #8254, ctor-eval would
[crash](https://github.com/WebAssembly/binaryen/blob/23d218d0bd469a399ff17b26fdd71164beeb63fa/src/tools/wasm-ctor-eval.cpp#L396)
when an imported tag was evaluated, but not when imported. Change the
code to allow imported tags even during evaluation.

Note that we can't reason about the identity of imported tags. In the
following code, $t1 and $t2 may be the same or different tags:
```wasm
(import "foo" "bar" (tag $t1))
(import "foo" "bar2" (tag $t2))
```

In this PR, we assume that $t1 and $t2 are different tags, and that
they're the same tag if the import name is the same (this is also not
true in general, the hosting environment may provide two different
values for the same exact import name). This may cause some correctness
issues. As a followup, we can make equality comparison of two imported
tags throw FailToEvalException to make evaluation correct.

Part of #8180.
@stevenfontanella stevenfontanella linked an issue Feb 11, 2026 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve interpreter spec test coverage Refactorings to imports in Binaryen

3 participants